Acoustic Factorisation
Author
Abstract
This paper describes a new technique for training a speech recognition system on inhomogeneous training data. The proposed technique, acoustic factorisation, attempts to explicitly model all the factors that affect the acoustic signal. By explicitly modelling all the factors, the trained model set may be used in a more flexible fashion than in standard adaptive training schemes. Since an individual model is trained for each factor, it is possible to factor in only those factors that are appropriate to a particular target domain, for example the distribution over all training speakers. The target-domain-specific factors are simply estimated from limited target-specific data, for example the target acoustic environment. The theory of this new approach for particular speaker and environment transforms is described. Initial experiments on a large vocabulary speech recognition task are presented.
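The idea of keeping one transform per factor and combining only the ones that suit the target domain can be illustrated with a minimal sketch. The abstract does not state the form of the transforms, so everything here is an assumption made for illustration: both factors are taken to be affine transforms of a Gaussian mean vector, the speaker factor is applied before the environment factor, and the names `affine`, `adapt_mean`, `speaker_a`, `office` and `car` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
Transform = Tuple[Matrix, Vector]  # (A, b), representing x -> A x + b


def affine(A: Matrix, b: Vector, x: Sequence[float]) -> Vector:
    """Return A x + b using plain Python lists."""
    return [
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(A, b)
    ]


def adapt_mean(mu: Sequence[float], speaker: Transform, environment: Transform) -> Vector:
    """Apply the speaker factor, then the environment factor.

    The ordering and the affine form are assumptions of this sketch,
    not something stated in the abstract.
    """
    A_s, b_s = speaker
    A_e, b_e = environment
    return affine(A_e, b_e, affine(A_s, b_s, mu))


# Hypothetical canonical mean and factor transforms.
canonical_mean = [1.0, 2.0]
speaker_a = ([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5])  # speaker factor: a shift
office = ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0])      # environment factor: a scaling
car = ([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])         # environment factor: a shift

# The same speaker factor is reused; only the environment factor is swapped.
mean_office = adapt_mean(canonical_mean, speaker_a, office)
mean_car = adapt_mean(canonical_mean, speaker_a, car)
```

Because each factor is held separately, moving to a new acoustic environment in this sketch only requires a new environment transform, while the speaker transform is left untouched.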
Similar resources
An explicit independence constraint for factorised adaptation in speech recognition
Speech signals are usually affected by multiple acoustic factors, such as speaker characteristics and environment differences. Usually, the combined effect of these factors is modelled by a single transform. Acoustic factorisation splits the transform into several factor transforms, each modelling only one factor. This allows, for example, estimating a speaker transform in a noise condition and...
Speaker and Noise Factorisation for Robust Speech Recognition
Speech recognition systems need to operate in a wide range of conditions. Thus they should be robust to extrinsic variability caused by various acoustic factors, for example speaker differences, transmission channel and background noise. For many scenarios, multiple factors simultaneously impact the underlying “clean” speech signal. This paper examines techniques to handle both speaker and back...
Mutually exclusive grounding for weakly supervised non-negative matrix factorisation
Non-negative Matrix Factorisation (NMF) has been successfully applied for learning the meaning of a small set of vocal commands without any prior knowledge of the language. This kind of learning is useful if flexibility in terms of the acoustic and language model is required, for example in assistive technologies for dysarthric speakers because they do not comply with common models. Vocal comma...
Adaptation of deep neural network acoustic models using factorised i-vectors
The use of deep neural networks (DNNs) in a hybrid configuration is becoming increasingly popular and successful for speech recognition. One issue with these systems is how to efficiently adapt them to reflect an individual speaker or noise condition. Recently speaker i-vectors have been successfully used as an additional input feature for unsupervised speaker adaptation. In this work the use o...
HAC-models: a novel approach to continuous speech recognition
In this paper, a bottom-up, activation-based paradigm for continuous speech recognition is described. Speech is described by co-occurrence statistics of acoustic events over an analysis window of variable length, leading to a vectorial representation of high but fixed dimension called “Histogram of Acoustic Co-occurrence” (HAC). During training, recurring acoustic patterns are discovered and as...
Publication date: 2001